Part III. Paper-pencil Analogs of Laboratory Performance Tests

Haym  Kruglak 

University of Minnesota, Minneapolis 14, Minnesota

Abstract

An attempt has been made to convert laboratory performance tests into essay
and multiple-choice items. Preliminary forms of the tests were administered
to about 160 elementary physics students. It was found that the
multiple-choice test was the least difficult of the three tests and the
performance test the most difficult. The correlation coefficients between
relatively complex performance items and their paper-pencil analogs were very
low. The correlations between the items dealing with specific skills and
parallel paper-pencil tests were low to moderate. This preliminary study
supports the hypothesis that paper-pencil tests are poor substitutes for
performance examinations.

Introduction 

Few experienced physics teachers would venture to predict on the basis
of conventional tests what a student will accomplish when confronted with
instruments, apparatus and materials; and yet the widespread use of
paper-pencil examinations is based on the assumption that ability to solve a
problem on paper is highly correlated with the ability to solve the same
problem in the shop or laboratory.

The results of earlier studies have indicated that: (1) performance
tests appeared to measure outcomes other than those sampled by the conventional
achievement tests in physics and (2) paper-pencil tests dealing with
specific skills in the laboratory were only slightly related to performance
tests containing more general and difficult tasks. If it were possible to
construct paper-pencil items that were highly correlated with parallel
performance tasks, then the measurement of laboratory achievement would be
greatly simplified.

The Problem

The major problem of the study could be stated as follows: To what
extent is the ability to solve a laboratory problem on paper related to
the ability to solve the same problem using apparatus and materials? The
term "same" implies parallel content, objective, method, conclusion or results.
The investigation sought the answers to a number of sub-problems: (1) Is
there any difference between the essay and multiple-choice tests with
analogous items? (2) If two laboratory tests are taken successively, which
produces the greater practice effect on the other, the performance or
paper-pencil test? (3) Are there any laboratory activities which lend
themselves to evaluation by paper-pencil analogs better than other activities?

The investigation was carried out at the University of Minnesota
during the Winter quarter 1953-54. Since many of the experimental conditions
were far from ideal, the study should be considered as exploratory and its
findings far from definitive.


The Experimental Procedure

a.  Test  construction 

It was decided to construct tests that would sample three types of
laboratory achievement: a complete laboratory problem familiar to the student
from his laboratory work during the quarter; an original task unfamiliar to
the student but involving relatively simple instrumental manipulations; a
group of specific skills and techniques. Each of the three parts was to be
prepared in three forms: performance, essay and short-answer, and
multiple-choice. The preliminary drafts were prepared by the writer and
Mr. R. V. Stuart. The items were criticized by Profs. C. N. Wall and
G. D. Freier. In designing the items three criteria were used as guides.

1. The items should be representative of the achievement commonly
expected in the sophomore physics laboratory.

2. The items should cover specific as well as general skills.

3. The paper-pencil items should be as analogous as possible to the
performance tasks.

After a great deal of rewriting and many conferences between the test
constructors and critics, the final draft of the tests used in the study
was acceptable to all concerned. The essay and multiple-choice forms were
mimeographed; location cards were made for the performance test. Photographic
reproductions of the given apparatus, meter faces, wired circuits, etc. were
multilithed and assembled into a booklet to accompany the paper-pencil tests.
The description of the given apparatus and the statement of the problem were
identical for the three forms.

The test dealt with the laboratory aspects of Electricity. Part I
called for the measurement of an unknown resistance by the Wheatstone bridge
method. In Part II the problem was to identify concealed circuit elements
inside a three-terminal box by using a voltmeter and a dry cell. Several
specific skills were included in Part III: reading a multiple range meter,
identifying a wired potentiometer circuit, identifying five pieces of
electrical apparatus by name, symbol and function, and drawing schematic
diagrams for two wired circuits. A major difference between some of the
performance and paper-pencil tasks centered about the collection of experimental
data: in the laboratory the student was pretty much on his own; for the essay
and multiple choice portion of the test, the data were supplied in
tabular form and the instrument readings by means of photographs. The
statement of the problem in Part II of the test is reproduced below. Fig. 1
accompanied the essay and multiple choice forms.


PART II        LOCATION: 24        TIME: 13 min.

GIVEN: Box with 3 terminals labeled A, B, C; voltmeter, dry cell, connectors.

Between any two terminals inside the box there are the following possible
circuit elements:

(a) Single resistor of 10 to 50 ohms
(b) Single low resistance cell with an emf of about 1.5 volts
(c) A heavy copper lead (zero resistance)
(d) An infinite resistance (open circuit)

The resistance of the voltmeter is about 200 ohms.

PROBLEM: (1) Which of the terminals are connected by concealed circuit
elements?
(2) Identify the elements (resistor, battery, etc.) inside the box and draw
a schematic diagram of the network.

HINT: It is possible to connect the given components in any manner without
damaging the apparatus. For example, some of the possible circuits are shown
below.

NOTE: To get credit you must sketch a schematic diagram of the circuit used,
record the data, and state the conclusions based on the data.

When instructor signals, move to location 35.

Insert Fig. 1 approximately here
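
The readings to be expected in the three-terminal box problem can be estimated from the values quoted in the problem statement. The sketch below is illustrative only. It assumes an ideal 1.5-volt dry cell of negligible internal resistance connected in a single series loop with the 200-ohm voltmeter and the element under test, a procedure the problem statement does not prescribe. The function name and the sample 10-ohm resistor are hypothetical.

```python
# Illustrative sketch: predicted voltmeter readings for the three-terminal box.
# Assumptions (not from the test itself): ideal 1.5 V dry cell, and the
# cell, voltmeter and unknown element connected in one series loop.

CELL_EMF = 1.5          # volts, nominal dry cell
METER_RESISTANCE = 200  # ohms, as quoted in the problem statement


def series_reading(element_resistance=0.0, element_emf=0.0):
    """Voltmeter reading with cell, meter and element in series.

    element_resistance: ohms (use float("inf") for an open circuit)
    element_emf: volts, positive if it aids the external cell
    """
    if element_resistance == float("inf"):
        return 0.0  # no current flows, so the meter reads zero
    total_emf = CELL_EMF + element_emf
    current = total_emf / (METER_RESISTANCE + element_resistance)
    return current * METER_RESISTANCE


readings = {
    "copper lead": series_reading(0.0),
    "10-ohm resistor": series_reading(10.0),
    "open circuit": series_reading(float("inf")),
    "concealed cell, aiding": series_reading(0.0, +1.5),
    "concealed cell, opposing": series_reading(0.0, -1.5),
}

for name, volts in readings.items():
    print(f"{name:26s} {volts:.2f} V")
```

Under these assumptions an open circuit and an opposing cell both give a zero reading; connecting the voltmeter alone across the two terminals separates the cases, since only the concealed cell deflects the meter.
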


b. Test population


The test population consisted of 83 students in Physics 5, a course for
premedics, and students in Physics 8, mostly Institute of Technology
majors. The subjects in Physics 5 had completed two quarters of physics
with College Physics by Sears and Zemansky as a textbook. The lecture
material included topics in mechanics, heat, and electricity with the
exception of electronics. The Physics 8 textbook was Analytical Experimental
Physics by Lemon and Ference. The students in Physics 8 had completed
mechanics, heat and magnetostatics, electrostatics, electromagnetism and
Ohm's Law. Both groups had performed six experiments from the
Physics Laboratory Manual by Wall and Levine: Electric and Magnetic Fields,
Joule's Law, Condenser Capacitance, Galvanometer Sensitivity, Wheatstone
Bridge, and Potentiometer. The students worked in the laboratory in pairs
and submitted a weekly written report which was graded by the laboratory
instructor.

c. Test administration

The experimental tests were administered during the last laboratory
period of the winter quarter. On the day of the test the students reported
to a classroom where the laboratory instructor divided the section at random
into two groups. One group followed the instructor to the laboratory and took
the performance test during the first half of the period; the other group
remained in the classroom and was administered the paper-pencil tests.
During the second half of the period the testing procedure was reversed.
Thus all students were required to take the performance test; one half were
given the essay form and the remaining half the multiple-choice test. The
administration time for each test was 52 minutes, with 26 minutes allotted for
the Wheatstone bridge problem and 13 minutes each for the other two parts.
The allotted time appeared to be ample for the paper-pencil tests; the students
were somewhat rushed in Part III of the performance examination. The order
of problem presentation was also randomized in all the three forms of the test.
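
The random division of a section and the reversed order in the second half of the period can be sketched as follows. This is a minimal illustration, not the procedure actually used by the instructors: the roster, the seed, and the function names are hypothetical, and the paper-pencil form is passed in as a parameter because the text does not say how the essay and multiple-choice halves were chosen.

```python
import random


def split_section(students, seed=None):
    """Randomly divide a laboratory section into two halves.

    Returns (performance_first, paper_first).
    """
    rng = random.Random(seed)
    shuffled = list(students)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def schedule(students, paper_form, seed=None):
    """Map each student to the order in which the two tests are taken."""
    performance_first, paper_first = split_section(students, seed)
    orders = {}
    for student in performance_first:
        orders[student] = ("performance", paper_form)
    for student in paper_first:
        orders[student] = (paper_form, "performance")
    return orders


section = [f"student_{i}" for i in range(1, 13)]  # hypothetical roster
orders = schedule(section, paper_form="essay", seed=1)
```

Every student appears once in `orders`, half with the performance test in the first half of the period and half with it in the second.
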

d. Test scoring

A detailed scoring key was worked out jointly by Prof. Wall, the writer,
the laboratory supervisor, and the five laboratory assistants in the two
courses. The teaching assistants had themselves taken the paper-pencil and
performance tests and were thoroughly familiar with the problems and the
apparatus. Each assistant graded one part or section of the performance test
as well as its essay analog for all the students in both courses. The weight
distribution for the three parts was: 40 points for the Wheatstone bridge;
30 points for the black box; 30 points for the miscellaneous skills. The
usual correction for guessing was applied to the multiple-choice scores.
Since no partial credit was allowed on the multiple choice items,
only the total scores on each part of the three tests were comparable.
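
The text does not spell out the correction for guessing. A common form subtracts from the number of right answers the number of wrong answers divided by one less than the number of choices per item. The sketch below assumes that form; the number of choices per item is not given in the text and is a hypothetical parameter here, as are the example counts.

```python
def corrected_score(right, wrong, choices):
    """Rights minus a fraction of wrongs; omitted items carry no penalty.

    choices is the number of alternatives per item (an assumption here,
    since the text does not state it).
    """
    if choices < 2:
        raise ValueError("an item needs at least two choices")
    return right - wrong / (choices - 1)


# Hypothetical example: 20 right, 8 wrong, five-choice items.
example = corrected_score(right=20, wrong=8, choices=5)
print(example)  # 18.0
```
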

Analysis  of  Data 

a. Comparison of samples

In accordance with the testing procedure there were four samples in each
of the two course populations: E1, essay first, performance second; E2, essay
second, performance first; M1, multiple-choice first, performance second;
M2, multiple-choice second, performance first. The analysis of variance was
applied to the scores on the performance test and final examination for the
preceding quarter in each of the two courses. No statistically significant
differences were found between the variances and the means of the four
samples and therefore they were adjudged to be random samples from the same
normal population as far as ability in physics was concerned. Each sample in
Physics 5 was compared with the corresponding sample in Physics 8 on the
experimental laboratory tests. No significant differences between the variances
and the means were found on the performance test. Of the four comparisons on
the paper-pencil tests only one was significant. The scores of all the subjects
in Physics 5 on the performance test were compared with the scores of all the
Physics 8 students on this test. No statistically significant differences
were found between the variances or the means of the two groups at the
significance level used. Consequently, it was reasonable to assume that the
Physics 5 and 8 groups were of comparable ability and could be pooled so as to
increase the size of the samples.
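
The comparison of two groups before pooling can be illustrated with a short sketch. The text reports an analysis of variance; the code below substitutes the simpler two-sample variance ratio and pooled-variance t statistic, which address the same two questions (equal variances, equal means) for a pair of groups. The score lists are hypothetical, and the resulting statistics would still have to be compared with tabled critical values.

```python
from math import sqrt
from statistics import mean, variance


def variance_ratio(a, b):
    """F statistic: larger sample variance divided by the smaller."""
    va, vb = variance(a), variance(b)
    return max(va, vb) / min(va, vb)


def pooled_t(a, b):
    """Student's t for two independent samples, using a pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


physics_5 = [48, 52, 41, 60, 55, 47]  # hypothetical performance scores
physics_8 = [50, 45, 58, 44, 53, 49]  # hypothetical performance scores

f_stat = variance_ratio(physics_5, physics_8)
t_stat = pooled_t(physics_5, physics_8)
print(f"F = {f_stat:.2f}, t = {t_stat:.2f}")
```
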


b. Correlations between the test parts

The correlations between the three parts of the tests are shown in
Table 1.


Insert  Table  1 approximately  here 


Most of the correlation coefficients are not significantly different from zero;
those that are have very low values. From the magnitude of the correlations
it was concluded that the three parts of the tests were essentially independent
of each other, as had been the intent of the test constructors.
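
Whether a correlation coefficient differs significantly from zero is commonly judged by converting it to a t statistic with n - 2 degrees of freedom. The sketch below shows that conversion together with a plain product-moment correlation. The function names, the paired scores, and the sample size in the example are hypothetical; the text does not state which test the author used.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_for_r(r, n):
    """t statistic for the hypothesis of zero correlation, n - 2 d.f."""
    return r * sqrt((n - 2) / (1 - r * r))


part_one = [30, 25, 35, 28, 32, 22]   # hypothetical part scores
part_two = [18, 20, 15, 22, 19, 17]   # hypothetical part scores

r = pearson_r(part_one, part_two)
t = t_for_r(r, len(part_one))
print(f"r = {r:.2f}, t = {t:.2f}")
```
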
c.  Comparison  of  the  total  scores  on  the  three  test  forms 

The standard deviations, means, their respective differences, and the
correlations between the performance and paper-pencil test scores are
reproduced in Table 2.


Insert Table 2 approximately here


The differences in the variabilities and averages for a given sample could be
accounted for by the order of test administration, by the difference in the
nature of the two tests, or by both factors. It is reasonable to assume that
there are no measurable differences between the four samples on any of the
three tests; thus, group E2 had an average of 48.5 and a standard deviation of
14.9 on the performance test, while group M2 had an average of 47.5 and a
standard deviation of 12.5 on the same test; the differences between the two
respective measures were not significant.

A comparison of the averages of the three groups to whom the three forms
were administered first shows a hierarchy of difficulty. The averages for the
multiple choice, essay and performance were 65.6, 54.9 and [?]3.5 respectively.
The differences between multiple-choice and each of the other two means
were significant. The difference between the essay and performance means was
not significant. In addition, there was a significant difference between the
standard deviations of the multiple-choice and performance tests. These two
forms can hardly be considered equivalent.

The data also indicate that the practice effects are much more
pronounced in the case of the performance tests, i.e., taking the essay test
first produces a greater difference in the mean of the performance test than
the effect of the reverse order of presentation. The effect of the multiple
choice is more pronounced than that of the essay. A likely interpretation is
that the essay and multiple choice forms contain possible suggestions for the
answers; the practice effect due to the performance experience is much smaller
with respect to the two paper-pencil forms and is almost the same for both.

The magnitudes of the correlation coefficients support the above observations;
the correlations are higher for the groups who took the paper-pencil tests
first. However, the values of the coefficients substantiate the low degree of
relationship between performance tasks and their paper-pencil analogs.
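
The practice effect discussed above amounts to a difference between two group means on the same test: the mean of the group that took the test second minus the mean of the group that took it first. A minimal sketch follows; the function name and all scores are hypothetical and are not taken from Table 2.

```python
from statistics import mean


def practice_effect(scores_when_first, scores_when_second):
    """Mean gain on a test when another form has been taken beforehand."""
    return mean(scores_when_second) - mean(scores_when_first)


performance_first = [44, 50, 39, 47]        # hypothetical scores
performance_after_essay = [52, 57, 46, 53]  # hypothetical scores

gain = practice_effect(performance_first, performance_after_essay)
print(gain)  # 7
```

Computing the same quantity for each paper-pencil form, with the performance test taken beforehand, gives the figures against which the performance gain would be compared.
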

d. Comparison of part scores

The statistical summary for the three parts of the tests is shown in
Table 3. The Wheatstone bridge problem was the easiest on the performance
test and most difficult as an essay question. This shows that the reproduction
of a previously learned skill is apparently easier than the grasp of the
underlying theory. The black box was very difficult as a practical problem
and relatively easy as a set of multiple choice items. It is more than likely
that the evaluation of original thinking is limited by the presence of the
correct answers and the systematic analysis of the problem by a series of
items. There was little difference between the outcomes on the miscellaneous
skills part on the three forms. Actually, in terms of the test construction
these skills could be more easily imitated on paper than the other two
problems. For the purpose of reading, the photograph of a meter face is almost
as good as the meter face proper. Again the correlations are highest for
this part of the test.


Insert Table 3 approximately here


Conclusions 

In view of the experimental limitations of the study, several trends
appear to be supported by the data.

1. Performance tests and their paper-pencil analogs differ in difficulty
as measured by mean scores. The multiple choice form differs significantly
from either the essay or the performance test.

2. There is a gain on each test caused by previous administration of one
of the other forms. The gain is greater for the performance test than for
either of the other two forms.

3. The low values of the correlations between the performance test scores
and their paper-pencil "equivalents" indicate that the latter are at best only
crude approximations to the evaluation of ability to deal with laboratory
materials and apparatus.

4. The relatively small differences between the means of the paper-pencil
and performance items designed to evaluate very specific skills suggest that
paper-pencil analogs in this area might be successfully constructed and
evaluated. However, the best techniques for approximating the real situations
must be used, i.e., photographs of apparatus, three-dimensional drawings,
models, etc.

5. The multiple-choice form of a laboratory test is probably the least
suitable type for evaluating originality.


Summary 

Three forms of a laboratory test in Electricity were constructed and
administered to pre-medical and engineering students at the University of
Minnesota. The items on the essay and multiple choice forms were made as
analogous as possible to the corresponding performance tests. Extensive use
was made of photographic technique to simulate the actual laboratory situations
on paper. The best relationship was obtained for the part of the test dealing
with specific skills. The paper-pencil tests which preceded


Table 1. Correlations between parts of the laboratory tests.
University of Minnesota, Physics 5 and 8, 1953-54.

paper-pencil  performance
.37  .11  .32  .20  -.11  .46**  .37*

Sg  1;7  oOU  0O6  o23  o23  .22  .23 
Table 2. Standard deviations, means, differences and correlation
coefficients for three laboratory tests, University of
Minnesota, Physics 5 and 8, 1953-54.

Rows: Essay first, Performance second; Performance first, Essay second;
Multiple choice first, Performance second; Performance first, Multiple
choice second.

I - Wheatstone bridge; 40 points
II - black box; 30 points
III - miscellaneous skills; 30 points

** Significant at the 1% level
* Significant at the 5% level


